Human Object Interaction Detection


Human-object interaction (HOI) detection is the task of identifying the set of interactions in an image. It involves localizing the subject (a human) and the target (an object) of each interaction and classifying the interaction label.
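As a rough illustration of the definition above, the output of an HOI detector can be thought of as a list of records, each pairing a human box and an object box with an interaction label. The sketch below is a minimal, hypothetical schema: the names `HOIInstance`, `keep_confident`, and `as_triplet`, the box convention, and the sample detections are all assumptions made for illustration and are not taken from any of the papers listed here.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class HOIInstance:
    """One detected interaction: who, what, and how (hypothetical schema)."""
    human_box: Box      # localization of the subject
    object_box: Box     # localization of the target
    object_label: str   # category of the target object
    interaction: str    # classified interaction label
    score: float        # confidence in [0, 1]


def keep_confident(instances: List[HOIInstance], threshold: float) -> List[HOIInstance]:
    """Return the instances whose score is at least `threshold`."""
    return [inst for inst in instances if inst.score >= threshold]


def as_triplet(inst: HOIInstance) -> Tuple[str, str, str]:
    """Summarize an instance as a (subject, interaction, object) triplet."""
    return ("human", inst.interaction, inst.object_label)


# Made-up detections for a single image, for illustration only.
detections = [
    HOIInstance((10, 20, 110, 220), (90, 150, 160, 200), "bicycle", "ride", 0.91),
    HOIInstance((10, 20, 110, 220), (300, 40, 340, 90), "cup", "hold", 0.12),
]

confident = keep_confident(detections, threshold=0.5)
triplets = [as_triplet(inst) for inst in confident]
print(triplets)
```

Keeping the two boxes and the interaction label in one record mirrors the two parts of the task: localization of the subject and target, and classification of the interaction between them.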

Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars

Feb 02, 2026

From Sycophancy to Sensemaking: Premise Governance for Human-AI Decision Making

Feb 02, 2026

Eye-Tracking-Driven Control in Daily Task Assistance for Assistive Robotic Arms

Jan 24, 2026

GlovEgo-HOI: Bridging the Synthetic-to-Real Gap for Industrial Egocentric Human-Object Interaction Detection

Jan 14, 2026

Order from Chaos: Physical World Understanding from Glitchy Gameplay Videos

Jan 23, 2026

Human detectors are surprisingly powerful reward models

Jan 21, 2026

Unlocking Large Audio-Language Models for Interactive Language Learning

Jan 21, 2026

Forest-Chat: Adapting Vision-Language Agents for Interactive Forest Change Analysis

Jan 21, 2026

TRec: Egocentric Action Recognition using 2D Point Tracks

Jan 08, 2026

Generative Human-Object Interaction Detection via Differentiable Cognitive Steering of Multi-modal LLMs

Dec 19, 2025